Move cpu0_stack out of the Xen text section and into BSS. This
avoids getting loads of bogus cpu0_stack lines in call
backtraces from non-debug builds.
Doing this requires greater alignment of the BSS section,
which reveals a bug in ld where the alignment padding is
not added to the program segment's memsz field. We get around
this by finding the address of the last symbol in the image,
and increasing our load image's memsz to include that symbol.
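
A minimal sketch of the memsz workaround described above, not the code
in this changeset. The struct, its field names, and extend_memsz() are
hypothetical stand-ins for the load image's ELF program header and
whatever tool patches it; the assumption is that the last symbol's
address marks the true end of the image in memory.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for an ELF program header. */
struct load_segment {
    uint32_t vaddr;   /* address at which the segment starts */
    uint32_t filesz;  /* bytes present in the file */
    uint32_t memsz;   /* bytes occupied in memory (includes BSS) */
};

/*
 * Grow memsz so the segment covers everything up to end_addr, the
 * address of the last symbol in the image. The size is never shrunk,
 * so a segment that is already large enough is left alone.
 */
static void extend_memsz(struct load_segment *seg, uint32_t end_addr)
{
    if (end_addr > seg->vaddr && end_addr - seg->vaddr > seg->memsz)
        seg->memsz = end_addr - seg->vaddr;
}
```

For example, a segment at 0x100000 with memsz 0x90000 and a last symbol
at 0x1a0000 ends up with memsz 0xa0000, which covers the alignment
padding that ld left out.
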
Also clean up the linker scripts.
Signed-off-by: Keir Fraser <keir@xensource.com>